16 March, 2016

GIS

Geographic Information Systems

GIS

Geographic Information Systems

  • incorporating
  • storing
  • manipulating
  • analyzing
  • displaying…

Spatial Data

What is spatial data?

Nonspatial data has no location information

nonspatial = data.frame(
  id=c(1,2,3,4),
  data=rnorm(4)
)
print(nonspatial)
##   id       data
## 1  1 -0.4931948
## 2  2  1.1307952
## 3  3 -0.3918342
## 4  4  0.6926913

What is spatial data?

Spatial data has location information

The simplest spatial data are points on a map

spatial = data.frame(
  id=c(1,2,3,4),
  data=rnorm(4),
  x=runif(4,-180,180),
  y=runif(4,-90,90)
)
print(spatial)
##   id       data         x         y
## 1  1 -1.2740971  51.22328  24.49623
## 2  2 -1.3815687 141.39977 -52.02908
## 3  3 -1.5401806 -39.14751  82.96313
## 4  4  0.1171307 162.79971  63.99383

What is spatial data?

Which we can convert to explicitly spatial data using the sp package. Most GIS packages in R store data as sp classes.

library(sp)

What is spatial data?

The sp package has a method called coordinates that converts points to an sp class.

coordinates(spatial) = ~ x + y
class(spatial)
## [1] "SpatialPointsDataFrame"
## attr(,"package")
## [1] "sp"
plot(spatial, axes=T)


What is spatial data?

Spatial data also needs a reference system or "projection" that allows us to represent spatial features on a map. Projections can be thought of as simply a coordinate system with an origin that is relative to a known point in space.

This is a whole field of mathematically intensive study termed "geodesy"

Much of the field of geodesy is jam-packed in the rgdal package, which is a wrapper for the Geospatial Data Abstration Library

library(rgdal)

What is spatial data?

rgdal includes a comprehensive list of projections that are typically represented as a string of parameters.

The most common is our standard latitude/longitude system, where the coordinates are angular and the origin is the equator directly south of Greenwich, England. The simplest projection string to denote this projections is:

"+proj=longlat"

To define the projection for spatial, we write to its proj4string slot:

proj4string(spatial) = "+proj=longlat"

Projections are a necessary evil for GIS users (to be continued)

What is spatial data?

With a projection associated with our spatial data, we can now relate it to other spatial data. In other words, let's make a map!

library(leaflet)
m = leaflet(data=spatial) %>%
  addTiles() %>%
  addMarkers()
m

What is spatial data?

Spatial data types

Spatial data types

Two main types: vector and raster

Vector = Polygons
Raster = Grid

Vector = Discrete
Raster = Continuous

Vector = Illustrator/Inkscape
Raster = Photoshop/GIMP







Vector Data

Vector Data Intro

  • How data is represented Vector data is represented by coordinate pairs,









Figure depicting * Points

  • Lines

  • Polygons

Vector Data Intro

Geometry is associated with other data, attribute data Conceptualize as a row in a table

  • What vector data is used for

Outline 2

Part I

  • introduce soils data, ways of looking at it

  • well data, read in and convert

  • plot the two, brief discuss issues with CRS matching

  • Example scenario, demo gIntersects and how to use introduce other types of spatial relations

  • Use over to extract attributes and select those above a threshold of sand

  • Introduce buffering and suggest trying with that

Vector Data Basics

Data Input/Output

library(rgdal)
soils = readOGR(
    dsn="data",
    layer="soilsData")
## OGR data source with driver: ESRI Shapefile 
## Source: "data", layer: "soilsData"
## with 75 features
## It has 27 fields
writeOGR(
    soils,
    "data",
    "soilsData_out",
    driver="ESRI Shapefile"
)

Vector Data Basics

Other ways of creating of spatial data from list of coordinates:

wells = read.delim("./data/WellLocations.tsv")
class(wells); head(wells)
## [1] "data.frame"
##           x        y pts.data.id
## 1 -90.05145 43.10047           1
## 2 -90.05553 43.10470           2
## 3 -90.07305 43.09013           3
## 4 -90.04716 43.08454           4
## 5 -90.07198 43.08850           5
## 6 -90.06599 43.09197           6
coordinates(wells) <- ~ x + y
class(wells)
## [1] "SpatialPointsDataFrame"
## attr(,"package")
## [1] "sp"

Vector Data Basics

Helper functions:

class(soils)
## [1] "SpatialPolygonsDataFrame"
## attr(,"package")
## [1] "sp"
slotNames(soils)
## [1] "data"        "polygons"    "plotOrder"   "bbox"        "proj4string"
length(soils)
## [1] 75

Vector Data Basics

str(soils@data[,1:10])
## 'data.frame':    75 obs. of  10 variables:
##  $ mukey  : Factor w/ 25 levels "2774742","2774772",..: 12 19 12 6 9 5 7 24 25 13 ...
##  $ muarcrs: Factor w/ 75 levels "0.40538405","0.90105194",..: 30 62 70 18 8 26 19 5 15 23 ...
##  $ Sand1  : num  21.5 11 21.5 12 29.5 ...
##  $ Sand2  : num  35.29 8.34 35.29 9.02 38.22 ...
##  $ Sand3  : num  45.09 7.56 45.09 16.59 64.52 ...
##  $ Sand4  : num  48.6 30.6 48.6 36.7 33.2 ...
##  $ Sand5  : num  0 31.1 0 27.1 57.2 ...
##  $ Silt1  : num  45.8 65.2 45.8 68.7 54.5 ...
##  $ Silt2  : num  37.3 64.9 37.3 60.3 43.5 ...
##  $ Silt3  : num  36.7 64.1 36.7 34 21.1 ...

Vector Data Basics

str(soils@polygons[1])
## List of 1
##  $ :Formal class 'Polygons' [package "sp"] with 5 slots
##   .. ..@ Polygons :List of 1
##   .. .. ..$ :Formal class 'Polygon' [package "sp"] with 5 slots
##   .. .. .. .. ..@ labpt  : num [1:2] 514199 291168
##   .. .. .. .. ..@ area   : num 10776
##   .. .. .. .. ..@ hole   : logi FALSE
##   .. .. .. .. ..@ ringDir: int 1
##   .. .. .. .. ..@ coords : num [1:21, 1:2] 514211 514206 514195 514178 514180 ...
##   .. ..@ plotOrder: int 1
##   .. ..@ labpt    : num [1:2] 514199 291168
##   .. ..@ ID       : chr "0"
##   .. ..@ area     : num 10776

Vector Data Basics

A number of common functions have methods for spatial data

silty = subset(soils, Silt1 > 70)
paste("There are", length(soils), "soil features total;")
## [1] "There are 75 soil features total;"
paste(length(silty), "with a silt percentage over 70")
## [1] "12 with a silt percentage over 70"

Vector Data Basics

Making simple maps is quite easy PUT ON NEXT PAGE

par(mfrow=c(1,2), bg=NA)
plot(
    soils,
    main="Soils Polygons",
    col=rainbow(5))
plot(
    wells,
    main="Well Data",
    col='red'
)












Coordinate Reference Systems

A (Very) Brief Break

A coordinate reference system (CRS) defines the surface of the world. There are many and if they don't match, errors and issues can arise

Coordinate Reference Systems

A (Very) Brief Break

To illustrate issues:

soils = readOGR(
    dsn="data",
    layer="soilsData")
plot(
    soils,
    main="Soils",
    col=rainbow(5)
)
plot(
    wells,
    add=T
)











Hmmm, where are the points?

Coordinate Reference Systems

A (Very) Brief Break

print(wells@proj4string)
## CRS arguments: NA
print(soils@proj4string)
## CRS arguments:
##  +proj=tmerc +lat_0=0 +lon_0=-90 +k=0.9996 +x_0=520000
## +y_0=-4480000 +datum=NAD83 +units=m +no_defs +ellps=GRS80
## +towgs84=0,0,0
print(coordinates(soils)[1:2]);print(coordinates(wells)[1:2]);
## [1] 514198.6 515864.4
## [1] -90.05145 -90.05553

Coordinate Reference Systems

A (Very) Brief Break

Solution: define the CRS then project the points to CRS of the soils data

wells@proj4string = CRS(
    "+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
wells = spTransform(
    wells,
    soils@proj4string)
plot(
    soils,
    col=rainbow(5)
)
plot(
    wells,
    add=T
)











Vector Data Geometry Relationships

How do two shapes relate? Intersects for if the geometries have at least one point in common or no points in common (disjoint)












Vector Data Geometry Relationships

List other relational operators. Dave, also demo clipping or unioning or…some geometry manipulation Introduce using rgeos examples ## Vector Data Example {.smaller}

gIntersects(wells, soils)
## Error in RGEOSBinPredFunc(spgeom1, spgeom2, byid, func): TopologyException: side location conflict at 514345.89542264002 292885.20596594003

Vector Data Example

## Adding some magic...
soils = gBuffer(soils, width = 0, byid = T)
gIntersects(wells, soils, byid = T)
##        1     2     3     4     5     6     7     8     9    10    11    12
## 0  FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 1  FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 2  FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 3  FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 4  FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 5  FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 6  FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 7  FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 8  FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 9  FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 10 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 11 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 12 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 13 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 14 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 15 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 16 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 17 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 18 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 19 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 20 FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE
## 21 FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 22 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 23 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 24 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 25 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 26 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 27 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 28 FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 29 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 30 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 31 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE
## 32 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 33 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 34 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 35 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 36 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 37 FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 38 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 39 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 40 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 41 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 42 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 43 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 44 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 45 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE
## 46 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 47 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 48 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 49 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 50 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 51 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 52 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 53 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 54 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 55 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 56 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 57 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 58 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 59 FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 60 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 61 FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE
## 62 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 63 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 64 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 65 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 66 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 67 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 68 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 69  TRUE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE  TRUE  TRUE FALSE FALSE
## 70 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 71 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 72 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 73 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 74 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##       13    14    15
## 0  FALSE FALSE FALSE
## 1  FALSE FALSE FALSE
## 2  FALSE FALSE FALSE
## 3  FALSE FALSE FALSE
## 4  FALSE FALSE FALSE
## 5  FALSE FALSE FALSE
## 6  FALSE FALSE FALSE
## 7  FALSE FALSE FALSE
## 8  FALSE FALSE FALSE
## 9  FALSE FALSE FALSE
## 10 FALSE FALSE FALSE
## 11 FALSE FALSE FALSE
## 12 FALSE FALSE FALSE
## 13 FALSE FALSE FALSE
## 14 FALSE FALSE FALSE
## 15 FALSE FALSE FALSE
## 16 FALSE FALSE FALSE
## 17 FALSE FALSE FALSE
## 18 FALSE FALSE FALSE
## 19 FALSE FALSE FALSE
## 20 FALSE FALSE FALSE
## 21 FALSE FALSE FALSE
## 22 FALSE FALSE FALSE
## 23 FALSE FALSE FALSE
## 24 FALSE FALSE FALSE
## 25 FALSE FALSE FALSE
## 26 FALSE FALSE FALSE
## 27 FALSE FALSE FALSE
## 28 FALSE FALSE FALSE
## 29 FALSE FALSE FALSE
## 30 FALSE FALSE FALSE
## 31 FALSE FALSE FALSE
## 32 FALSE FALSE FALSE
## 33  TRUE FALSE  TRUE
## 34 FALSE FALSE FALSE
## 35 FALSE FALSE FALSE
## 36 FALSE FALSE FALSE
## 37 FALSE FALSE FALSE
## 38 FALSE FALSE FALSE
## 39 FALSE FALSE FALSE
## 40 FALSE FALSE FALSE
## 41 FALSE FALSE FALSE
## 42 FALSE FALSE FALSE
## 43 FALSE FALSE FALSE
## 44 FALSE FALSE FALSE
## 45 FALSE FALSE FALSE
## 46 FALSE FALSE FALSE
## 47 FALSE FALSE FALSE
## 48 FALSE FALSE FALSE
## 49 FALSE FALSE FALSE
## 50 FALSE FALSE FALSE
## 51 FALSE FALSE FALSE
## 52 FALSE FALSE FALSE
## 53 FALSE FALSE FALSE
## 54 FALSE FALSE FALSE
## 55 FALSE  TRUE FALSE
## 56 FALSE FALSE FALSE
## 57 FALSE FALSE FALSE
## 58 FALSE FALSE FALSE
## 59 FALSE FALSE FALSE
## 60 FALSE FALSE FALSE
## 61 FALSE FALSE FALSE
## 62 FALSE FALSE FALSE
## 63 FALSE FALSE FALSE
## 64 FALSE FALSE FALSE
## 65 FALSE FALSE FALSE
## 66 FALSE FALSE FALSE
## 67 FALSE FALSE FALSE
## 68 FALSE FALSE FALSE
## 69 FALSE FALSE FALSE
## 70 FALSE FALSE FALSE
## 71 FALSE FALSE FALSE
## 72 FALSE FALSE FALSE
## 73 FALSE FALSE FALSE
## 74 FALSE FALSE FALSE

Vector Data Example

gIntersects(wells, soils, byid = T, returnDense = F)
## $`1`
## [1] 70
## 
## $`2`
## [1] 38
## 
## $`3`
## [1] 22
## 
## $`4`
## [1] 29
## 
## $`5`
## [1] 60
## 
## $`6`
## [1] 21
## 
## $`7`
## [1] 70
## 
## $`8`
## [1] 62
## 
## $`9`
## [1] 70
## 
## $`10`
## [1] 70
## 
## $`11`
## [1] 32
## 
## $`12`
## [1] 46
## 
## $`13`
## [1] 34
## 
## $`14`
## [1] 56
## 
## $`15`
## [1] 34

Vector Data Example

indxs = unique(unlist(gIntersects(wells, soils, byid = T, returnDense = F)))
soils_with_wells = soils[indxs, ]
plot(soils_with_wells, col = rainbow(5))
plot(wells, add = T, cex = 4)











Vector Data Example

Could iterate through intersect list or…

soilsDat = over(wells, soils)
class(soilsDat)
## [1] "data.frame"
soilsDat$id = 1:15

wells = merge(wells, soilsDat, by.x="pts.data.id", by.y="id")
head(wells@data)
##   pts.data.id   mukey      muarcrs    Sand1     Sand2    Sand3    Sand4
## 1           1 2806652 405.22965906 35.87900 32.747667 69.83400 73.27298
## 2           2 2806642 127.65139889 14.10700 14.001433 13.06847 14.28767
## 3           3 2774772  23.43535446 10.51290  9.529000 10.50538 24.74279
## 4           4 2774821   9.99030648 13.44400 11.495200  9.31480 30.69747
## 5           5 2774777   4.35597440 12.03429  9.015357 16.59286 36.73413
## 6           6 2774798   8.69344646 21.47179 35.293641 45.09015 48.59615
##      Sand5    Silt1    Silt2    Silt3    Silt4    Silt5    Clay1    Clay2
## 1 73.07354 49.00485 40.71233 15.42715 13.57983 14.06340 15.11615 26.54000
## 2 15.86000 71.64800 71.62823 70.58203 68.64867 69.06100 14.24500 14.37033
## 3 30.45305 70.65485 65.64850 65.46994 35.22879 54.23537 18.83225 24.82250
## 4 48.58000 68.94600 67.97213 66.78053 47.81453 35.96000 17.61000 20.53267
## 5 27.11190 68.71821 60.28321 34.02286 39.86052 41.12143 19.24750 30.70143
## 6  0.00000 45.78205 37.27431 36.66754 28.13462  0.00000 32.74615 27.43205
##      Clay3    Clay4    Clay5       OM1       OM2       OM3       OM4
## 1 14.73885 13.14719 12.86306 1.7208333 0.3348077 0.2575000 0.2519231
## 2 16.34950 17.06367 15.07900 2.1704167 1.9599167 2.7409167 1.8906667
## 3 24.02468 40.02842 15.31158 0.9820000 0.2550000 0.2550395 0.2535526
## 4 23.90467 21.48800 15.46000 2.7300000 1.6890000 0.5740000 0.4036667
## 5 49.38429 23.40536 31.76667 1.2584821 0.2544643 0.2500000 0.2500000
## 6 18.24231 13.01282  0.00000 0.8837179 0.2500000 0.2500000 0.2243590
##      OM5 Depth1 Depth2 Depth3 Depth4 Depth5
## 1 0.2500     39     40     40     40     40
## 2 0.2575     30     31     30     31     31
## 3 0.2500     40     40     41     40     41
## 4 0.2600     30     31     31     31     31
## 5 0.2500     28     29     28     29     29
## 6 0.0000     39     40     40     40     40

Vector Data Example

Perhaps we want to expand our our of concern

## buffer of 50 meters (because of our projection) around each well
wells_area = gBuffer(wells, byid = T, width = 50)
class(wells_area)
## [1] "SpatialPolygonsDataFrame"
## attr(,"package")
## [1] "sp"
soils@data = subset(soils@data, select = c("Sand1", "Silt1"))
head(over(wells_area, soils, fn = mean))
##      Sand1    Silt1
## 1 24.84412 58.12692
## 2 12.96008 68.01767
## 3 11.08520 69.97385
## 4 13.62663 68.09750
## 5 16.69081 56.99441
## 6 18.15888 53.44851

Outline 4

Part II

  • Scenario create map of %dem with indication of percent turnout.

  • Walk throug various iterations, raise problems then solve

  • plot just lines

  • introduce thematic mapping/choropleth mapping/data classification and coloring

  • Show classInt and creating the color vector

  • legend manipulation

  • creating centroids and scaling proportionally

Vector Data Plotting

SUBSET THE DATA HERE DAVE – LESS FIELDS

wards = readOGR("data", "WardData")
## OGR data source with driver: ESRI Shapefile 
## Source: "data", layer: "WardData"
## with 280 features
## It has 67 fields
names(wards@data)
##  [1] "NAME_x"    "ASM"       "SEN"       "CON"       "MCD_NAM"  
##  [6] "PERSONS_"  "WHITE"     "BLACK"     "HISPANIC"  "ASIAN"    
## [11] "AMINDIAN"  "PISLAND"   "OTHER"     "OTHERMLT"  "PERSONS1" 
## [16] "WHITE18"   "BLACK18"   "HISPANIC1" "ASIAN18"   "AMINDIAN1"
## [21] "PISLAND1"  "OTHER18"   "OTHERMLT1" "CNTY_NA"   "PRES_TO"  
## [26] "PRES_RE"   "PRES_DE"   "PRES_CO"   "PRESIND1"  "PRESIND2" 
## [31] "PRESIND3"  "PRESIND4"  "PRESIND5"  "PRESIND6"  "PRESSCA"  
## [36] "SEN_TOT"   "SEN_REP"   "SEN_DEM"   "SEN_IND1"  "SEN_IND2" 
## [41] "SEN_IND3"  "SEN_CON"   "SEN_SCA"   "CON_TOT"   "CON_REP"  
## [46] "CON_DEM"   "CON_IND"   "CON_SCA"   "SS_TOT_"   "SS_REP_"  
## [51] "SS_DEM_"   "SS_IND_"   "SS_SCAT"   "ASM_TOT"   "ASM_REP_" 
## [56] "ASM_DEM_"  "ASM_IND1"  "ASM_SCA"   "ASM_DEM2"  "ASM_IND2" 
## [61] "ASM_REP2"  "DA_TOT_"   "DA_REP_"   "DA_DEM_"   "DA_IND_"  
## [66] "DA_SCAT"   "DA_DEM2"











Vector Data Plotting

Scenario tasked with:

  • Create map of percent Democratic votes by ward

  • Also display percent turnout












Vector Data Plotting

Think about distribution and how to classify?

par(mfrow(3,1))
hist(wards@data$SEN_DEM, main="Distribution of Dem")












Vector Data Plotting

Vector Data Plotting

  • Scenario create map of %dem with indication of percent turnout.

  • Walk throug various iterations, raise problems then solve

  • Show classInt and creating the color vector

  • legend manipulation

  • creating centroids and scaling proportionally

Vector Data Plotting

options(stringsAsFactors=F)
source("./misc_scripts/function_proper_legend.r")
library(rgdal)
library(rgeos)
library(foreign)
library(classInt)
library(RColorBrewer)
library(scales)

wards@data$SEN_PERC_DEM = with(wards@data, SEN_DEM/SEN_TOT)
wards@data$SEN_PERC_TURN = with(wards@data, SEN_TOT/PERSONS1)

Vector Data Plotting

## defining number of classes
num_classes = 6
## the color palette
pal = brewer.pal(num_classes, "RdBu")
## the class intervals to use for the colors
class_ints = classIntervals(
    wards@data$SEN_PERC_DEM,
    num_classes,
    style="quantile")
## grab the colors for plotting
colrs = findColours(class_ints, pal)

Vector Data Plotting

## Custom legend formatting
source("./misc_scripts/function_proper_legend.r")
legtxt = properLegend(colrs, 2)
plot(wards,
     col=colrs,
     main="Senate 24",
     border=NA)

legend("topleft",
       legtxt,
       title="Proportion Democrat",
       fill=pal,
       bty='n'
)

Vector Data Plotting

Vector Data Modeling

Vector Data Modeling

Vector Data Modeling

wards_centroids = gCentroid(wards, byid=T) wards_centroids = SpatialPointsDataFrame( gCentroid(wards, byid=T), over(wards_centroids, wards) )

Vector Data Modeling

Outline 3

Part III

  • introduce ward data

  • scenario for modeling: predict %dem turnout

  • use regular linear regression, moran I the residuals

  • construct neighborhood weights using adjacency and distance, then construct model

  • take a look at results

Vector Data Modeling

library(spdep)
## Loading required package: Matrix
wards@data$SEN_PERC_DEM = with(wards@data, SEN_DEM/SEN_TOT)
wards@data$SEN_PERC_TURN = with(wards@data, SEN_TOT/PERSONS1)
wards@data$CON_PERC_DEM = with(wards@data, CON_DEM/CON_TOT)
wards@data$PRES_PERC_DEM = with(wards@data, PRES_DE/PRES_TO)

wards = subset(wards, !is.nan(wards@data$SEN_PERC_DEM))
neighborhood_binary = poly2nb(wards)
list_of_weights = nb2listw(neighborhood_binary, zero.policy=T)
moran.test(wards@data$SEN_PERC_DEM, list_of_weights, alternative="two.sided")
## 
##  Moran I test under randomisation
## 
## data:  wards@data$SEN_PERC_DEM  
## weights: list_of_weights  
## 
## Moran I statistic standard deviate = 15.295, p-value < 2.2e-16
## alternative hypothesis: two.sided
## sample estimates:
## Moran I statistic       Expectation          Variance 
##       0.567975167      -0.003623188       0.001396579
spat_lin_reg = spautolm(
    SEN_PERC_DEM ~ WHITE + BLACK + PRES_PERC_DEM + CON_PERC_DEM,
    data=wards,
    family="SAR",
    listw=list_of_weights)

Vector Data Modeling

  • scenario for modeling: predict %dem turnout

  • use regular linear regression, moran I the residuals

  • construct neighborhood weights using adjacency and distance, then construct model

  • take a look at results

Raster Data

Intro

A raster grid is rectangular.

Grid is another word for matrix.

Grid is another word for image.

A GIS raster grid is a matrix/image with an associated location and projection.

Intro

At a minimum, a GIS raster grid contains:

  1. matrix of values
  2. projection
  3. reference point, often (x,y) of the lower-left corner
  4. cellsize









Raster I/O

The rgdal rgdal packages is primarily for I/O and projecting GIS data

library(rgdal)

The raster package does everything rgdal does, but it includes lots of additional functionality.

library(raster)

Raster I/O

elev = readGDAL("data/dem_wi.tif")
writeGDAL(elev, "data/dem_wi_out.tif")
elev = raster("data/dem_wi.tif")
writeRaster(elev, "data/dem_wi_out.tif")

Raster data structure

The raster object elev has all the necessary pieces of spatial information:

elev
## class       : RasterLayer 
## dimensions  : 284, 387, 109908  (nrow, ncol, ncell)
## resolution  : 0.01666667, 0.01666667  (x, y)
## extent      : -93.03262, -86.58262, 42.3949, 47.12823  (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0 
## data source : /home/devans/Documents/GeRgraphyPresentation/data/dem_wi.tif 
## names       : dem_wi 
## values      : 175, 565.4104  (min, max)

Raster data structure

Which means we can make a map!

m = leaflet() %>%
  addTiles() %>%
  addRasterImage(elev, opacity=0.5)
m

Raster data structure

Raster analysis

Remember that rasters are just matrices!

Therefore, most matrix operations can be applied to rasters. For example:

plot(
  elev > 400,
  col=c("red", "blue")
)








Raster analysis

Rasters can be easily converted to matrices to do more complex work.

lat_grad = apply(
  as.matrix(elev),
  1,
  mean,
  na.rm=T
)
plot(lat_grad, type="l")






Raster overlay

Most raster analysis ultimately executes some sort of overlay.

The issue:

To overlay two or more rasters, their projections, extents, and cellsizes must align perfectly.

This can be a difficult task.

Raster overlay

coordinate systems

What is the highest point in each county?

# Pseudo-code
1. Read in elevation data (raster grid)
2. Read in county boundary data (polygons)
3. Convert counties to raster grid that aligns with elevation grid
4. Find maximum elevation gridcell within each county

Raster overlay

coordinate systems

counties = readOGR("data", "WI_Counties")
## OGR data source with driver: ESRI Shapefile 
## Source: "data", layer: "WI_Counties"
## with 72 features
## It has 7 fields
elev
## class       : RasterLayer 
## dimensions  : 284, 387, 109908  (nrow, ncol, ncell)
## resolution  : 0.01666667, 0.01666667  (x, y)
## extent      : -93.03262, -86.58262, 42.3949, 47.12823  (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0 
## data source : /home/devans/Documents/GeRgraphyPresentation/data/dem_wi.tif 
## names       : dem_wi 
## values      : 175, 565.4104  (min, max)

Raster overlay

coordinate systems

proj4string(counties)
## [1] "+proj=tmerc +lat_0=0 +lon_0=-90 +k=0.9996 +x_0=520000 +y_0=-4480000 +ellps=GRS80 +units=m +no_defs"
proj4string(elev)
## [1] "+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0"

Raster overlay

coordinate systems

extent(counties)
## class       : Extent 
## xmin        : 294839 
## xmax        : 770036.4 
## ymin        : 225108.8 
## ymax        : 734398.4
extent(elev)
## class       : Extent 
## xmin        : -93.03262 
## xmax        : -86.58262 
## ymin        : 42.3949 
## ymax        : 47.12823

Raster overlay

coordinate systems

cty_grid = rasterize(counties, elev, field="COUNTY_FIP")
summary(cty_grid)
##          layer
## Min.        NA
## 1st Qu.     NA
## Median      NA
## 3rd Qu.     NA
## Max.        NA
## NA's    109908

Raster overlay

coordinate systems

prj = proj4string(elev)
cty_prj = spTransform(counties, prj)
To do this, we use the spTransform function in the sp package.

Raster overlay

coordinate systems

extent(cty_prj)
## class       : Extent 
## xmin        : -92.88924 
## xmax        : -86.8048 
## ymin        : 42.49197 
## ymax        : 47.08077
extent(elev)
## class       : Extent 
## xmin        : -93.03262 
## xmax        : -86.58262 
## ymin        : 42.3949 
## ymax        : 47.12823

Raster overlay

coordinate systems

plot(elev)
plot(cty_prj, add=TRUE)














Raster overlay

coordinate systems

cty_grid = rasterize(counties, elev, field="COUNTY_FIP")
summary(cty_grid)
##          layer
## Min.        NA
## 1st Qu.     NA
## Median      NA
## 3rd Qu.     NA
## Max.        NA
## NA's    109908

Raster overlay

coordinate systems

extent(cty_grid)
## class       : Extent 
## xmin        : -93.03262 
## xmax        : -86.58262 
## ymin        : 42.3949 
## ymax        : 47.12823
extent(elev)
## class       : Extent 
## xmin        : -93.03262 
## xmax        : -86.58262 
## ymin        : 42.3949 
## ymax        : 47.12823

Raster overlay

coordinate systems

library(dplyr)
ovly = data.frame(
  elev=getValues(elev),
  cty=getValues(cty_grid)
)

hi_pt = ovly %>%
  group_by(cty) %>%
  mutate(
    elev = (elev == max(elev, na.rm=T)) * elev
  ) %>%
  ungroup()

elev = setValues(elev, hi_pt[["elev"]])
elev[elev == 0] = NA
hi_pt_sp = rasterToPoints(elev, spatial=T)

Raster overlay

coordinate systems